Winnipeg
- Europe > France (0.15)
- Europe > United Kingdom (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- (28 more...)
- Transportation > Passenger (1.00)
- Transportation > Marine (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
- (4 more...)
WaterSearch: A Quality-Aware Search-based Watermarking Framework for Large Language Models
Lin, Yukang, Shao, Jiahao, Jiang, Shuoran, Zhu, Wentao, Lu, Bingjie, Wu, Xiangping, Siebert, Joanna, Chen, Qingcai
Watermarking acts as a critical safeguard in text generated by Large Language Models (LLMs). By embedding identifiable signals into model outputs, watermarking enables reliable attribution and enhances the security of machine-generated content. Existing approaches typically embed signals by manipulating token generation probabilities. Despite their effectiveness, these methods inherently face a trade-off between detectability and text quality: the signal strength and randomness required for robust watermarking tend to degrade the performance of downstream tasks. In this paper, we design a novel embedding scheme that controls seed pools to facilitate diverse parallel generation of watermarked text. Based on that scheme, we propose WaterSearch, a sentence-level, search-based watermarking framework adaptable to a wide range of existing methods. WaterSearch enhances text quality by jointly optimizing two key aspects: 1) distribution fidelity and 2) watermark signal characteristics. Furthermore, WaterSearch is complemented by a sentence-level detection method with strong attack robustness. We evaluate our method on three popular LLMs across ten diverse tasks. Extensive experiments demonstrate that our method achieves an average performance improvement of 51.01\% over state-of-the-art baselines at a watermark detectability strength of 95\%. In challenging scenarios such as short text generation and low-entropy output generation, our method yields performance gains of 47.78\% and 36.47\%, respectively. Moreover, under different attack senarios including insertion, synonym substitution and paraphrase attasks, WaterSearch maintains high detectability, further validating its robust anti-attack capabilities. Our code is available at \href{https://github.com/Yukang-Lin/WaterSearch}{https://github.com/Yukang-Lin/WaterSearch}.
- Asia > Middle East > Lebanon (0.04)
- Europe > United Kingdom > England (0.04)
- Oceania > Australia (0.04)
- (6 more...)
- Information Technology > Security & Privacy (1.00)
- Energy > Power Industry > Utilities > Nuclear (0.46)
HumaniBench: A Human-Centric Framework for Large Multimodal Models Evaluation
Raza, Shaina, Narayanan, Aravind, Khazaie, Vahid Reza, Vayani, Ashmal, Radwan, Ahmed Y., Chettiar, Mukund S., Singh, Amandeep, Shah, Mubarak, Pandya, Deval
Although recent large multimodal models (LMMs) demonstrate impressive progress on vision language tasks, their alignment with human centered (HC) principles, such as fairness, ethics, inclusivity, empathy, and robustness; remains poorly understood. We present HumaniBench, a unified evaluation framework designed to characterize HC alignment across realistic, socially grounded visual contexts. HumaniBench contains 32,000 expert-verified image question pairs derived from real world news imagery and spanning seven evaluation tasks: scene understanding, instance identity, multiple-choice visual question answering (VQA), multilinguality, visual grounding, empathetic captioning, and image resilience testing. Each task is mapped to one or more HC principles through a principled operationalization of metrics covering accuracy, harmful content detection, hallucination and faithfulness, coherence, cross lingual quality, empathy, and robustness.We evaluate 15 state-of-the-art LMMs under this framework and observe consistent cross model trade offs: proprietary systems achieve the strongest performance on ethics, reasoning, and empathy, while open-source models exhibit superior visual grounding and resilience. All models, however, show persistent gaps in fairness and multilingual inclusivity. We further analyze the effect of inference-time techniques, finding that chain of thought prompting and test-time scaling yield 8 to 12 % improvements on several HC dimensions. HumaniBench provides a reproducible, extensible foundation for systematic HC evaluation of LMMs and enables fine-grained analysis of alignment trade-offs that are not captured by conventional multimodal benchmarks. https://vectorinstitute.github.io/humanibench/
- North America > Canada > Ontario > Toronto (0.04)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (14 more...)
- Media > News (1.00)
- Leisure & Entertainment > Sports (1.00)
- Law (1.00)
- (3 more...)
Communication-Efficient Learning for Satellite Constellations
Tudose, Ruxandra-Stefania, Grüss, Moritz H. W., Kim, Grace Ra, Johansson, Karl H., Bastianello, Nicola
Satellite constellations in low-Earth orbit are now widespread, enabling positioning, Earth imaging, and communications. In this paper we address the solution of learning problems using these satellite constellations. In particular, we focus on a federated approach, where satellites collect and locally process data, with the ground station aggregating local models. We focus on designing a novel, communication-efficient algorithm that still yields accurate trained models. To this end, we employ several mechanisms to reduce the number of communications with the ground station (local training) and their size (compression). We then propose an error feedback mechanism that enhances accuracy, which yields, as a byproduct, an algorithm-agnostic error feedback scheme that can be more broadly applied. We analyze the convergence of the resulting algorithm, and compare it with the state of the art through simulations in a realistic space scenario, showcasing superior performance.
- Europe > Spain > Aragón (0.04)
- North America > United States > Florida > Hillsborough County > University (0.04)
- North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
- (3 more...)
- Education (0.70)
- Information Technology > Security & Privacy (0.68)
- Food & Agriculture > Agriculture (0.46)
Advancing Multi-Agent RAG Systems with Minimalist Reinforcement Learning
Wu, Yihong, Ma, Liheng, Li, Muzhi, Zhou, Jiaming, Ding, Lei, Hao, Jianye, Leung, Ho-fung, King, Irwin, Zhang, Yingxue, Nie, Jian-Yun
Large Language Models (LLMs) equipped with modern Retrieval-Augmented Generation (RAG) systems often employ multi-turn interaction pipelines to interface with search engines for complex reasoning tasks. However, such multi-turn interactions inevitably produce long intermediate contexts, as context length grows exponentially with exploration depth. This leads to a well-known limitation of LLMs: their difficulty in effectively leveraging information from long contexts. This problem is further amplified in RAG systems that depend on in-context learning, where few-shot demonstrations must also be included in the prompt, compounding the context-length bottleneck. To address these challenges, we propose Mujica-MyGo, a unified framework for efficient multi-turn reasoning in RAG. Inspired by the divide-and-conquer principle, we introduce Mujica (Multi-hop Joint Intelligence for Complex Question Answering), a multi-agent RAG workflow that decomposes multi-turn interactions into cooperative sub-interactions, thereby mitigating long-context issues. To eliminate the dependency on in-context learning, we further develop MyGO (Minimalist Policy Gradient Optimization), a lightweight and efficient reinforcement learning algorithm that enables effective post-training of LLMs within complex RAG pipelines. We provide theoretical guarantees for MyGO's convergence to the optimal policy. Empirical evaluations across diverse question-answering benchmarks, covering both text corpora and knowledge graphs, show that Mujica-MyGO achieves superior performance.
- North America > Canada > Quebec > Montreal (1.00)
- Africa > Namibia (0.15)
- Asia > South Korea > Seoul > Seoul (0.04)
- (25 more...)
- Research Report (0.51)
- Workflow (0.48)
Physics-Guided Conditional Diffusion Networks for Microwave Image Reconstruction
Chehelgami, Shirin, LoVetri, Joe, Khoshdel, Vahab
Abstract--A conditional latent-diffusion based framework for solving the electromagnetic inverse scattering problem associated with microwave imaging is introduced. This generative machine-learning model explicitly mirrors the non-uniqueness of the ill-posed inverse problem. Unlike existing inverse solvers utilizing deterministic machine learning techniques that produce a single reconstruction, the proposed latent-diffusion model generates multiple plausible permittivity maps conditioned on measured scattered-field data, thereby generating several potential instances in the range-space of the non-unique inverse mapping. A forward electromagnetic solver is integrated into the reconstruction pipeline as a physics-based evaluation mechanism. The space of candidate reconstructions form a distribution of possibilities consistent with the conditioning data and the member of this space yielding the lowest scattered-field data discrepancy between the predicted and measured scattered fields is reported as the final solution. Synthetic and experimental labeled datasets are used for training and evaluation of the model. An innovative labeled synthetic dataset is created that exemplifies a varied set of scattering features. Training of the model using this new dataset produces high-quality permittivity reconstructions achieving improved generalization with excellent fidelity to shape recognition. UANTIT A TIVE Microwave Imaging (MWI) is a noninvasive technique with applications in a wide range of areas such as medical diagnostics, and non-destructive testing in agricultural and industrial settings [1]-[3].
- North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth
Hashemi, Helia, Rühle, Victor, Rajmohan, Saravan
Reasoning models have gained significant attention due to their strong performance, particularly when enhanced with retrieval augmentation. However, these models often incur high computational costs, as both retrieval and reasoning tokens contribute substantially to the overall resource usage. In this work, we make the following contributions: (1) we propose a retrieval-augmented reasoning model that dynamically adjusts the length of the retrieved document list based on the query and retrieval results; (2) we develop a cost-aware advantage function for training of efficient retrieval-augmented reasoning models through reinforcement learning; and (3) we explore both memory- and latency-bound implementations of the proposed cost-aware framework for both proximal and group relative policy optimization algorithms. We evaluate our approach on seven public question answering datasets and demonstrate significant efficiency gains, without compromising effectiveness. In fact, we observed that the model latency decreases by ~16-20% across datasets, while its effectiveness increases by ~5% on average, in terms of exact match.
- North America > United States > District of Columbia > Washington (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Spain > Canary Islands > Tenerife (0.04)
- (11 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
- Europe > France (0.15)
- Europe > United Kingdom (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- (28 more...)
- Transportation > Passenger (1.00)
- Transportation > Marine (1.00)
- Leisure & Entertainment > Sports > Football (1.00)
- (4 more...)
Artificial Authority: From Machine Minds to Political Alignments. An Experimental Analysis of Democratic and Autocratic Biases in Large-Language Models
Ożegalska-Łukasik, Natalia, Łukasik, Szymon
Political beliefs vary significantly across different countries, reflecting distinct historical, cultural, and institutional contexts. These ideologies, ranging from liberal democracies to rigid autocracies, influence human societies, as well as the digital systems that are constructed within those societies. The advent of generative artificial intelligence, particularly Large Language Models (LLMs), introduces new agents in the political space-agents trained on massive corpora that replicate and proliferate socio-political assumptions. This paper analyses whether LLMs display propensities consistent with democratic or autocratic world-views. We validate this insight through experimental tests in which we experiment with the leading LLMs developed across disparate political contexts, using several existing psychometric and political orientation measures. The analysis is based on both numerical scoring and qualitative analysis of the models' responses. Findings indicate high model-to-model variability and a strong association with the political culture of the country in which the model was developed. These findings highlight the need for more detailed examination of the socio-political dimensions embedded within AI systems.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Austria > Vienna (0.14)
- Europe > Russia (0.05)
- (19 more...)
- Questionnaire & Opinion Survey (1.00)
- Research Report > New Finding (0.93)
Can an AI-Powered Presentation Platform Based On The Game "Just a Minute" Be Used To Improve Students' Public Speaking Skills?
This study explores the effectiveness of applying AI and gamification into a presentation platform aimed at University students wanting to improve their public speaking skills in their native tongue. Specifically, a platform based on the radio show, Just a Minute (JAM), is explored. In this game, players are challenged to speak fluently on a topic for 60 seconds without repeating themselves, hesitating or deviating from the topic. JAM has proposed benefits such as allowing students to improve their spontaneous speaking skills and reduce their use of speech disfluencies ("um", "uh", etc.). Previous research has highlighted the difficulties students face when speaking publicly, the main one being anxiety. AI Powered Presentation Platforms (AI-PPPs), where students can speak with an immersive AI audience and receive real-time feedback, have been explored as a method to improve student's speaking skills and confidence. So far they have shown promising results which this study aims to build upon. A group of students from the University of York are enlisted to evaluate the effectiveness of the JAM platform. They are asked to fill in a questionnaire, play through the game twice and then complete a final questionnaire to discuss their experiences playing the game. Various statistics are gathered during their gameplay such as the number of points they gained and the number of rules they broke. The results showed that students found the game promising and believed that their speaking skills could improve if they played the game for longer. More work will need to be carried out to prove the effectiveness of the game beyond the short term.
- North America > United States (0.04)
- North America > Canada > Manitoba > Winnipeg Metropolitan Region > Winnipeg (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland (0.04)
- Research Report > New Finding (1.00)
- Questionnaire & Opinion Survey (1.00)
- Education > Educational Setting (1.00)
- Leisure & Entertainment > Games > Computer Games (0.68)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)